Abstract: The coronavirus (COVID-19) pandemic is putting healthcare systems across the world
under unprecedented and increasing pressure, according to the World Health Organization (WHO).
With the advances in computer algorithms, and especially artificial intelligence, detecting
this virus in its early stages will help patients recover quickly and relieve the pressure on
healthcare systems. In this paper, a GAN with deep transfer learning for coronavirus detection in
chest X-ray images is presented. The lack of COVID-19 datasets, especially of chest X-ray images,
is the main motivation of this study. The main idea is to collect all the COVID-19 images
available at the time of writing and use the GAN to generate more images, so that the virus can
be detected from the available X-ray images with the highest possible accuracy. The dataset used
in this research was collected from different sources and is available for researchers to download
and use. The collected dataset contains 307 images in four classes: COVID-19, normal, pneumonia
bacterial, and pneumonia virus. Three deep transfer models are selected for investigation:
Alexnet, Googlenet, and Resnet18. These models were selected because they contain a small number
of layers in their architectures, which reduces the complexity, memory consumption, and execution
time of the proposed model. Three scenarios are tested in the paper: the first includes four
classes from the dataset, the second includes three classes, and the third includes two classes.
All the scenarios include the COVID-19 class, as detecting it is the main target of this research.
In the first scenario, Googlenet is selected as the main deep transfer model, as it achieves
80.6% testing accuracy. In the second scenario, Alexnet is selected as the main deep transfer
model, as it achieves 85.2% testing accuracy. In the third scenario, which includes two classes
(COVID-19 and normal), Googlenet is selected as the main deep transfer model, as it achieves
100% testing accuracy and 99.9% validation accuracy. All the performance measurements support
the results obtained in this research.


Keywords: 2019 novel coronavirus; deep transfer learning; machine learning; COVID-19; SARS-CoV-2; 
convolutional neural network; GAN 
1. Introduction


In 2019, Wuhan, a commercial center of Hubei province in China, faced an outbreak of a novel
coronavirus that killed hundreds and infected thousands of individuals within the first days of
the epidemic. Chinese researchers named the novel virus the 2019 novel coronavirus (2019-nCoV)
or the Wuhan virus [1]. The International Committee on Taxonomy of Viruses named the virus
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) and the disease Coronavirus disease
2019 (COVID-19) [2-4]. The subgroups of the coronavirus family are alpha-CoV (α), beta-CoV (β),
gamma-CoV (γ), and delta-CoV (δ). SARS-CoV-2 was announced to be a member of the beta-CoV (β)
group of coronaviruses. In 2003, people in Kwangtung were infected with a virus that led to
Severe Acute Respiratory Syndrome (SARS-CoV). This virus was confirmed as a member of the
beta-CoV (β) subgroup and was named SARS-CoV [5]. Historically, SARS-CoV infected more than 8000
individuals across 26 countries, with a death rate of 9%. By comparison, SARS-CoV-2 had infected
more than 750,000 individuals across 150 countries, with a death rate of 4%, as of this writing.
This shows that the transmission rate of SARS-CoV-2 is higher than that of SARS-CoV.
The transmission ability is enhanced because of genetic recombination of the S protein in the
RBD region [6].

Beta-coronaviruses that cause disease in people have generally originated in wild animals, either
bats or rats [7,8]. SARS-CoV-1 and MERS-CoV (camel flu) were transmitted to people from some wild
cats and Arabian camels, respectively, as shown in Figure 1. The sale and purchase of unknown
animals may be the source of coronavirus infection. The discovery of several lineages of pangolin
coronavirus and their similarity to SARS-CoV-2 suggests that pangolins should be considered
possible hosts of the novel 2019 coronavirus. Wild animals must be removed from wild animal
markets to stop animal coronavirus transmission [9]. Coronavirus transmission has been confirmed
by the World Health Organization (WHO) and by the US Centers for Disease Control and Prevention,
with evidence of human-to-human transmission from five different cases outside China, namely in
Italy [10], the US [11], Nepal [12], Germany [13], and Vietnam [14]. As of 31 March 2020, more than
750,000 SARS-CoV-2 cases had been confirmed, with 150,000 recoveries and 35,000 deaths. Table 1
shows some statistics about SARS-CoV-2 in some countries.

Figure 1. Coronavirus transmission from animals to humans.


Table 1. SARS-CoV-2 statistics in some countries.


Location          Confirmed   Recovered   Deaths
United States     164,345     5,945       3,171
Italy             101,739     14,620      11,591
Spain             94,417      19,259      8,269
China             81,518      76,052      3,305
Germany           67,051      7,635       682
Iran              44,606      14,656      2,898
France            43,973      7,202       3,018
United Kingdom    22,141      135         1,408


1.1. Deep Learning 


Deep Learning (DL) is a subfield of machine learning concerned with techniques inspired by the
neurons of the brain [16]. Today, DL is quickly becoming a crucial technology in image/video
classification and detection. DL depends on algorithms for simulating reasoning processes, mining
data, or developing abstractions [17]. The hidden deep layers in DL map input data to labels
in order to analyze hidden patterns in complicated data [18]. Besides their use in medical X-ray
recognition, DL architectures are also used in other image processing and computer vision
applications in medicine. DL improves such medical systems to achieve better outcomes, widen the
scope of detectable illnesses, and implement real-time medical image [19,20] disease detection
systems. Table 2 shows a series of major contributions in the field, from the neural network to
deep learning [21].


Table 2. Major contributions in the history of the neural network to deep learning [21,22].


Milestone/Contribution          Year
McCulloch-Pitts Neuron          1943
Perceptron                      1958
Backpropagation                 1974
Neocognitron                    1980
Boltzmann Machine               1985
Restricted Boltzmann Machine    1986
Recurrent Neural Networks       1986
Autoencoders                    1987
LeNet                           1990
LSTM                            1997
Deep Belief Networks            2006
Deep Boltzmann Machine          2009


1.2. Generative Adversarial Network 


A Generative Adversarial Network (GAN) is a class of deep learning model invented by Ian
Goodfellow in 2014 [23]. GAN models have two main networks, the generative network and the
discriminative network. The generator network is responsible for generating new fake data
instances that look like the training data. The discriminator tries to distinguish between real
data and the fake (artificially generated) data produced by the generator network, as shown in
Figure 2. In a GAN, the generator network tries to fool the discriminator network, and the
discriminator network tries to avoid being fooled [24-27].
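
The adversarial game between the two networks is commonly written as the minimax objective of the
original GAN formulation [23]. The symbols below are the usual notation and are not defined
elsewhere in this paper: G is the generator, D the discriminator, z a noise vector drawn from a
prior p_z, and p_data the distribution of real images.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The discriminator maximizes V by assigning high probability to real images and low probability to
generated ones, while the generator minimizes V by producing images that the discriminator accepts
as real.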

1.3. Convolutional Neural Networks


Convolutional Neural Networks (ConvNets or CNNs) are a category of deep learning techniques
used primarily to recognize and classify images. ConvNets have achieved extraordinary success in
medical image/video classification and detection. In 2012, Ciregan et al. and Krizhevsky et al.
[28,29] showed how CNNs running on Graphics Processing Units (GPUs) can improve many vision
benchmark records, such as MNIST [30], Chinese characters [31], Arabic digits recognition [32],
Arabic handwritten characters recognition [33], NORB (jittered, cluttered) [34], traffic signs
[35], and the large-scale ImageNet [36] benchmarks. In the following years, various advances
in ConvNets further increased the accuracy rate on image detection/classification competition
tasks. Pre-trained ConvNet models brought significant improvements in the annual ImageNet Large
Scale Visual Recognition Competition (ILSVRC). Deep Transfer Learning (DTL) is a deep learning
(DL) approach that stores the weights learned while solving one image classification problem and
applies them to a related problem. Many DTL models have been introduced, such as
VGGNet [37], GoogleNet [38], ResNet [39], Xception [40], Inception-V3 [41], and DenseNet [42].

The novelty of this paper is as follows: (i) the introduced ConvNet models have an end-to-end
structure without classical feature extraction and selection methods; (ii) we show that a GAN
is an effective technique for generating X-ray images; (iii) chest X-ray images are one of the
best tools for the classification of SARS-CoV-2; (iv) the deep transfer learning models are shown
to yield very high outcomes on the small COVID-19 dataset. The rest of the paper is organized as
follows. Section 2 explores related work and determines the scope of this work. Section 3
discusses the dataset used in our paper. Section 4 presents the proposed models, while Section 5
presents and discusses the achieved outcomes. Finally, Section 6 provides conclusions and
directions for further research.


2. Related Works 


This section surveys recent research on applying machine learning and deep learning to pneumonia
and coronavirus X-ray classification. Classical image classification can be divided into three
main stages: image preprocessing, feature extraction, and feature classification. Stephen et al.
[43] proposed classifying and detecting the presence of pneumonia in a collection of chest X-ray
image samples using a ConvNet model trained from scratch on the dataset in [44]. The outcomes
obtained were training loss = 12.88%, training accuracy = 95.31%, validation loss = 18.35%, and
validation accuracy = 93.73%.

In [45], the authors introduced an early diagnosis system for pneumonia from chest X-ray images
based on Xception and VGG-16. The study used the database introduced by Kermany et al. [44],
which contains approximately 5800 frontal chest X-ray images: 1600 normal cases and 4200 abnormal
pneumonia cases. The experimental outcomes showed that the VGG-16 network outperformed the
Xception network in classification rate, with 87%, whereas the Xception network outperformed the
VGG-16 network with a sensitivity of 85%, precision of 86%, and recall of 94%. The Xception
network is more suitable for classifying X-ray images than the VGG-16 network. Varshni et al. [46]
proposed pre-trained ConvNet models (VGG-16, Xception, Res50, Dense-121, and Dense-169) as
feature extractors followed by different classifiers (SVM, Random Forest, k-nearest neighbors,
Naive Bayes) for the detection of normal and abnormal pneumonia X-ray images. The authors used
the ChestX-ray14 dataset introduced by Wang et al. [47].

Chouhan et al. [48] introduced an ensemble deep model that combines the outputs of all transfer
deep models for the classification of pneumonia using deep learning. The Guangzhou Medical
Center [44] database provided a total of approximately 5200 X-ray images, divided into 1300
normal and 3900 abnormal X-ray images. The proposed model reached a misclassification error of
3.6% with a sensitivity of 99.6% on test data from the database. Ref. [49] proposed Compressed
Sensing (CS) with a deep transfer learning model for automatic classification of pneumonia in
X-ray images to assist medical physicians. The dataset used for this work contained approximately
5850 X-ray images in two categories (abnormal/normal) obtained from Kaggle. Comprehensive
simulation outcomes showed that the proposed approach classifies pneumonia (abnormal/normal)
with a misclassification rate of 2.66%.

In this research, we introduce deep transfer learning models to classify COVID-19 X-ray images.
To prepare chest X-ray images as input to the convolutional neural network, we use a GAN to
generate additional X-ray images. After that, a classifier is used to ensemble the outputs of
the classification outcomes. The proposed transfer model was evaluated on the proposed dataset.


3. Dataset 


The COVID-19 dataset [50] utilized in this research [51] was created by Dr. Joseph Cohen,
a postdoctoral fellow at the University of Montreal. The Chest X-ray Images (Pneumonia)
dataset [44] was also used to build the proposed dataset. The dataset [52] is organized into two
folders (train, test), each containing sub-folders for the image categories. There are 306 X-ray
images (JPEG) in four categories (COVID-19/normal/pneumonia bacterial/pneumonia virus).
The number of images for each class is presented in Table 3. Figure 3 illustrates samples of the
images used for this research. Figure 3 also shows that there is a lot of variation in image
sizes and features, which may affect the accuracy of the proposed model presented in the next
section.


Table 3. Number of images for each class in the COVID-19 dataset.


Dataset/Class   Covid   Normal   Pneumonia_bac   Pneumonia_vir   Total
Train           60      70       70              70              270
Test            9       9        9               9               36
Total           69      79       79              79              306


Figure 3. Samples of the used images in this research.


4. The Proposed Model


Figure 4. The proposed GAN/deep transfer learning model.


The proposed model includes two main deep learning components: the first is the GAN and the
second is the deep transfer model. Figure 4 illustrates the proposed GAN/deep transfer learning
model. The GAN is used mainly in the preprocessing phase, while the deep transfer model is used
in the training, validation, and testing phases.

Algorithm 1 introduces the proposed transfer model in detail. Let D = {Alexnet, Googlenet,
Resnet18} be the set of transfer models. Each deep transfer model is fine-tuned with the COVID-19
X-ray Images dataset (X, Y), where X is the set of N input images, each of size 512 × 512
(height × width), and Y contains the corresponding class labels,
$Y = \{y \mid y \in \{\text{COVID-19}; \text{normal}; \text{pneumonia bacterial}; \text{pneumonia virus}\}\}$.
The dataset is divided into a training set $(X_{train}; Y_{train})$, which holds 90% of the data
and is used for training and validation, and a test set with the remaining 10%. The 90% portion
is further divided into 80% for training and 20% for validation. An 80%/20% training/validation
split has proved effective in many studies, such as [53-57]. The training data are then divided
into mini-batches, each of size n = 64, such that $(X_q; Y_q) \in (X_{train}; Y_{train})$,
$q = 1, 2, \ldots, N$, and the DCNN model d ∈ D is iteratively optimized to reduce the loss
function given in Equation (1):


$$C(w, X_q) = \frac{1}{n} \sum_{x \in X_q,\; y \in Y_q} c\big(d(x, w), y\big), \qquad (1)$$


where d(x, w) is the ConvNet model that predicts the label of input x given the weights w, y is
the true label, and c(·) is the multi-class entropy loss function.
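
As an illustration of the data split and of Equation (1), the following minimal Python sketch
may help. It is not the implementation used in this paper (which was written in MATLAB); the
function names, the shuffling with a fixed seed, and the use of plain probability lists in place
of the model output d(x, w) are all assumptions made for the example.

```python
import math
import random


def split_dataset(items, test_frac=0.10, val_frac=0.20, seed=0):
    """Hold out test_frac for testing, then split the remainder into train/validation."""
    items = list(items)
    random.Random(seed).shuffle(items)  # assumed: random shuffling before the split
    n_test = round(len(items) * test_frac)
    test, rest = items[:n_test], items[n_test:]
    n_val = round(len(rest) * val_frac)
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test


def mini_batches(items, n=64):
    """Yield consecutive mini-batches of at most n items."""
    for start in range(0, len(items), n):
        yield items[start:start + n]


def batch_loss(probabilities, labels):
    """Mean multi-class cross-entropy over one mini-batch, as in Equation (1).

    probabilities[k] is the predicted class distribution for sample k
    (a stand-in for d(x, w)); labels[k] is the index of its true class.
    """
    total = sum(-math.log(p[y]) for p, y in zip(probabilities, labels))
    return total / len(labels)


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
print(batch_loss([[0.5, 0.5], [0.25, 0.75]], [0, 1]))
```

Because the three subsets are slices of one shuffled list, they cannot overlap, which matches the
requirement that training, validation, and testing data stay disjoint.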

This research relies on deep transfer learning CNN architectures to transfer the learned
weights, which reduces the training time, the mathematical calculations, and the consumption of
the available hardware resources. Several studies [53,58,59] built their own architectures,
but those architectures are problem-specific and do not fit the data presented in this paper.
The deep transfer learning CNN models investigated in this research are Alexnet [29], Resnet18
[39], and Googlenet [60]. These CNN models have few layers compared with large CNN models such
as Xception [40], Densenet [42], and Inceptionresnet [61], which consist of 71, 201, and 164
layers, respectively. This choice reduces the training time and the complexity of the
calculations.

Algorithm 1 Introduced algorithm.


1: Input data: COVID-19 Chest X-ray Images (X, Y), where Y = {y | y ∈ {COVID-19; normal;
   pneumonia bacterial; pneumonia virus}}
2: Output data: The transfer model that detected the COVID-19 Chest X-ray image x ∈ X
3: Pre-processing steps:
4:    Modify the X-ray input to dimension 512 height × 512 width
5:    Generate X-ray images using GAN
6:    Mean normalize each X-ray data input
7: Download and reuse transfer models D = {Alexnet, Googlenet, Resnet18}
8: Replace the last layer of each transfer model by (4 × 1) layer dimension.
9: foreach d ∈ D do
10:    μ = 0.01
11:    for epochs = 1 to 20 do
12:       foreach mini-batch (Xj; Yj) ∈ (Xtrain; Ytrain) do
             Modify the coefficients of the transfer d(·)
             if the error rate is increased for five epochs then
                μ = μ × 0.01
             end
          end
13:    end
14: end
15: foreach x ∈ Xtest do
16:    the outcome of all transfer architectures, d ∈ D
17: end

4.1. Generative Adversarial Network 


GANs consist of two different networks that are trained simultaneously. The first network,
the generator, is trained on image generation, while the second, the discriminator, is used for
discrimination. GANs are considered a special type of deep learning model. The generator network
in this research consists of five transposed convolutional layers, four ReLU layers, four batch
normalization layers, and a Tanh layer at the end of the model, while the discriminator network
consists of five convolutional layers, four leaky ReLU layers, and three batch normalization
layers. All the convolutional and transposed convolutional layers use the same window size of
4 × 4 pixels with 64 filters for each layer. Figure 5 presents the structure and the sequence of
layers of the GAN network proposed in this research.

The GAN network helped to overcome the overfitting problem caused by the limited number of
images in the dataset. Moreover, it made the dataset 30 times larger than the original: the
number of images reached 8100 for the four classes after using the GAN network. This helps in
achieving remarkable testing accuracy and performance metrics. The achieved results are discussed
in detail in the experimental results section. Figure 6 presents samples of the output of the
GAN network for the COVID-19 class.
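
The figure of 8100 images can be checked against the training counts in Table 3. The short Python
sketch below assumes that every class is enlarged by the same factor of 30; the paper states the
overall factor and total but not the per-class counts, so the per-class numbers are illustrative.

```python
# Training images per class, taken from Table 3.
TRAIN_COUNTS = {"covid": 60, "normal": 70, "pneumonia_bac": 70, "pneumonia_vir": 70}


def augmented_counts(train_counts, factor=30):
    """Scale each class by the same factor (an assumption made for this sketch)."""
    return {name: count * factor for name, count in train_counts.items()}


counts = augmented_counts(TRAIN_COUNTS)
print(counts)
print(sum(counts.values()))
```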

Figure 5. The structure and the sequence of layers for the proposed GAN network.


Figure 6. Samples of images generated using the proposed GAN structure. 


4.2. Deep Transfer Learning 


Convolutional Neural Networks (ConvNets) are the most successful type of model for image
classification and detection. A single ConvNet model contains many layers that learn to detect
edges and simple features in the early layers and more complex deep features in the deeper
layers. An image is convolved with filters (kernels) and then max pooling is applied; this
process may be repeated over several layers until recognizable features are obtained.
Take, for example, a feature map of size $W^{l-1} \times H^{l-1} \times C^{l-1}$ (where W × H is
width × height) in layer l − 1 and a filter bank with $C^l$ kernels of size
$f^l \times f^l \times C^{l-1}$. With the two additional parameters stride $s^l$ and padding
$p^l$, the output feature map in layer l has size $W^l \times H^l \times C^l$, as shown in
Equation (2):


$$(W^l, H^l) = \left\lfloor \frac{(W^{l-1}, H^{l-1}) + 2p^l - f^l}{s^l} \right\rfloor + 1 \qquad (2)$$


where $\lfloor \cdot \rfloor$ denotes the floor operation. The number of channels in each kernel
must be equal to that of the input map, as in Equation (3):


$$x_i^l = \sigma\Big( \sum_{j \in V_i} f_{ij}^l * x_j^{l-1} + b^l \Big) \qquad (3)$$


where i and j are indexes of the output and input network maps, ranging over $W^l \times H^l$
and $W^{l-1} \times H^{l-1}$, respectively. $V_i$ indicates the receptive field of the kernel and
$b^l$ is the bias term. In Equation (3), σ(·) is a non-linearity function applied to introduce
non-linearity in deep transfer learning. In our transfer method, we used ReLU, given in
Equation (4), as the non-linearity function for a rapid training process:


$$\sigma(x_{input}) = \max(0, x_{input}) \qquad (4)$$
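
The output-size rule of Equation (2) and the ReLU of Equation (4) can be illustrated with a few
lines of Python. This is a sketch using the standard floor-division form of the rule; the stride
and padding values in the example calls are assumptions chosen for illustration and are not
settings reported in this paper.

```python
def conv_output_size(width, height, kernel, stride=1, padding=0):
    """Spatial size of a convolution output: floor((size + 2p - f) / s) + 1 per axis."""
    def one_axis(size):
        return (size + 2 * padding - kernel) // stride + 1
    return one_axis(width), one_axis(height)


def relu(x):
    """ReLU non-linearity: max(0, x)."""
    return max(0.0, x)


# A 512 x 512 input with a 4 x 4 kernel; stride 2 and padding 1 are assumed values.
print(conv_output_size(512, 512, kernel=4, stride=2, padding=1))
print(relu(-3.0), relu(2.5))
```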
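
Our cost function is given in Equation (5):


$$L(s, g) = L_{cls}(s_{c^*}) + \lambda[p^* > 0]\, L_{reg}(g, g^*), \qquad (5)$$


where $s_{c^*}$ is the output for label $c^*$, while $g$ and $g^*$ denote the
$[g_x, g_y, g_w, g_h]$ of bounding boxes. The term $\lambda[p^* > 0]$ considers only the
non-background boxes ($p^* = 0$ is background). This cost function has a detection loss $L_{cls}$
and a regression loss $L_{reg}$, given in Equations (6)-(8):


$$L_{cls}(s_{c^*}) = -\log(s_{c^*}) \qquad (6)$$


and


$$L_{reg}(g, g^*) = \sum_{i \in \{x, y, w, h\}} R_{L1}(g_i - g_i^*) \qquad (7)$$


where:


$$R_{L1}(x) = \begin{cases} 0.5x^2, & \text{if } |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases} \qquad (8)$$
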
In terms of the optimizer, Stochastic Gradient Descent (SGD) [62] with momentum 0.9 is chosen to
update the weight parameters. This optimizer updates the weights using both the update from the
previous iteration and the current gradient. To avoid overfitting of the deep learning network,
we use the dropout technique [63] and the early-stopping technique [64] to select the best
training steps. As for the learning rate policy, the step size technique is used in SGD. We set
the learning rate (μ) to 0.01 and the number of iterations to 2000. The mini-batch size is set to
64, and early stopping is triggered if the accuracy does not improve for five epochs.
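
A minimal Python sketch of one SGD-with-momentum update and of the early-stopping check is given
below. It uses the common velocity form of momentum and plain lists of floats for the weights;
both choices, and the function names, are assumptions for illustration rather than a description
of the MATLAB optimizer used in the experiments.

```python
def sgd_momentum_step(weights, grads, velocity, lr=0.01, momentum=0.9):
    """One update: v <- momentum * v - lr * grad, then w <- w + v."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


def should_stop(accuracies, patience=5):
    """True when the last `patience` epochs did not beat the best earlier accuracy."""
    if len(accuracies) <= patience:
        return False
    best_before = max(accuracies[:-patience])
    return max(accuracies[-patience:]) <= best_before


w, v = sgd_momentum_step([1.0], [2.0], [0.0])
w, v = sgd_momentum_step(w, [2.0], v)
print(w, v)
print(should_stop([0.5, 0.6, 0.6, 0.6, 0.6, 0.6, 0.6]))
```

The second step moves further than the first even though the gradient is unchanged, because the
velocity carries over part of the previous update.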


5. Experimental Results 


The introduced model was implemented in MATLAB. The implementation was CPU specific.
All experiments were conducted on a computer server equipped with an Intel Xeon processor
(2 GHz) and 96 GB of RAM. The proposed model was tested under three different scenarios: the
first with four classes, the second with three classes, and the third with two classes. All the
scenarios included the COVID-19 class. Every scenario consists of a validation phase and a
testing phase. The validation phase uses 20% of the total generated images, while the testing
phase uses around 10% of the original dataset.

The main difference between the validation and testing phases lies in how the data are used.
In the validation phase, the data are used to check the generalization ability of the model or
for early stopping during the training process. In the testing phase, the data are used only for
evaluation, not for training or validation. The data used in training, validation, and testing
never overlap, so that the results for the proposed model are reliable.

Before listing the major results of this research, Table 4 presents the validation and testing
accuracy for four classes before using the GAN as an image augmenter. The results in Table 4
show that the validation and testing accuracies are quite low and not acceptable for a model
intended to detect coronavirus.

Table 4. Validation and testing accuracy for 4 classes according to 3 deep transfer learning models
without using GAN.


Accuracy              Alexnet   Googlenet   Resnet18
Validation Accuracy   73.1%     76.9%       67.3%
Testing Accuracy      52.0%     52.8%       50.0%


5.1. Verification and Testing Accuracy Measurement 


Testing accuracy is one of the measures that demonstrate the precision and accuracy of any
proposed model. The confusion matrix is another accurate measure that gives more insight into
the achieved validation and testing accuracy. First, the four-class scenario is investigated
with the three deep transfer learning models, Alexnet, Googlenet, and Resnet18. Figures 7-9
illustrate the confusion matrices for the validation and testing phases for four classes in the
dataset.

Figure 7. Confusion matrices of Alexnet for 4 classes (a) validation accuracy, and (b) testing accuracy.

Figure 8. Confusion matrices of Googlenet for 4 classes (a) validation accuracy, and (b) testing accuracy.

Figure 9. Confusion matrices of Resnet18 for 4 classes (a) validation accuracy, and (b) testing accuracy.


Table 5 summarizes the validation and testing accuracy of the different deep transfer models
for four classes. According to validation accuracy, Resnet18 achieved the highest accuracy with
99.6%. This is due to the large number of parameters in the Resnet18 architecture, which contains
11.7 million parameters; this is not more than Alexnet, but Alexnet includes only 8 layers while
Resnet18 includes 18 layers. According to testing accuracy, Googlenet achieved the highest
accuracy with 80.6%. This is due to its large number of layers compared with the other models,
as it contains about 22 layers.


Table 5. Validation and testing accuracy for 4 classes according to 3 deep transfer learning models.


Accuracy              Alexnet   Googlenet   Resnet18
Validation Accuracy   98.5%     98.9%       99.6%
Testing Accuracy      66.7%     80.6%       66.7%


The second scenario tested in this research is when the dataset contains only three classes.
Figures 10-12 illustrate the confusion matrices for the validation and testing phases for three
classes in the dataset, including the Covid class.

Figure 10. Confusion matrices of Alexnet for 3 classes (a) validation accuracy, and (b) testing accuracy.

Figure 12. Confusion matrices of Resnet18 for 3 classes (a) validation accuracy, and (b) testing accuracy.


Table 6 summarizes the validation and testing accuracy of the different deep transfer models
for three classes. According to validation accuracy, Resnet18 achieved the highest accuracy with
99.6%. According to testing accuracy, Alexnet achieved the highest accuracy with 85.2%. This may
be due to the large number of parameters in the Alexnet architecture, which includes 61 million
parameters, and also to the elimination of the fourth class, pneumonia virus, which has features
similar to COVID-19, itself considered a type of viral pneumonia. Eliminating the pneumonia
virus class helps all the deep transfer models achieve better testing accuracy than when they
are trained on four classes.


Table 6. Validation and testing accuracy for 3 classes according to 3 deep transfer learning models.


Accuracy              Alexnet   Googlenet   Resnet18
Validation Accuracy   97.2%     98.3%       99.6%
Testing Accuracy      85.2%     81.5%       81.5%


The third scenario tested is when the dataset includes only two classes, the Covid class and the
normal class. Figure 13 illustrates the confusion matrices of the three transfer models for
validation accuracy, while the confusion matrix for testing accuracy, which is the same for all
the deep transfer models selected in this research, is presented in Figure 14.

Figure 14. Confusion matrix for testing accuracy for Alexnet, Googlenet, and Resnet18.


Table 7 summarizes the validation and testing accuracy of the different deep transfer models
for two classes. According to validation accuracy, Googlenet achieved the highest accuracy with
99.9%. According to testing accuracy, all the pre-trained models, Alexnet, Googlenet, and
Resnet18, achieved the highest accuracy with 100%. This is due to the elimination of the third
and fourth classes, pneumonia bacterial and pneumonia virus, which have features similar to
COVID-19. This leads to a noteworthy improvement in testing accuracy: whichever deep transfer
model is used, the testing accuracy reaches 100%. The best model is therefore chosen according
to validation accuracy, where Googlenet achieved 99.9%, so Googlenet is the selected deep
transfer model in the third scenario.


Table 7. Validation and testing accuracy for 2 classes according to 3 deep transfer learning models.


Accuracy              Alexnet   Googlenet   Resnet18
Validation Accuracy   99.6%     99.9%       99.8%
Testing Accuracy      100%      100%        100%


To conclude this part, every scenario has its own deep transfer model. In the first scenario,
Googlenet was selected; in the second scenario, Alexnet was selected; and in the third scenario,
Googlenet was selected. To draw a full conclusion about the deep transfer model that fits the
dataset in all scenarios, the testing accuracy for every class is required for the different
deep transfer models. Table 8 presents the testing accuracy for every class in the three
scenarios. Table 8 does not help much in determining the deep transfer model that fits all
scenarios, but for distinguishing the COVID-19 class from the other classes, Alexnet and Resnet18
are the selected deep transfer models, as they achieved 100% testing accuracy for the COVID-19
class whether the number of classes is 2, 3, or 4.

Table 8. Testing accuracy for every class for the different 3 scenarios.


# of Classes   Class Name      Alexnet   Googlenet   Resnet18
4 classes      Covid           100%      100%        100%
               Normal          64.3%     100%        100%
               Pneumonia_bac   44.4%     70%         50%
               Pneumonia_vir   50%       66.7%       40%
3 classes      Covid           100%      81.8%       100%
               Normal          77.7%     75.0%       100%
               Pneumonia_bac   77.8%     87.5%       64.3%
2 classes      Covid           100%      100%        100%
               Normal          100%      100%        100%

5.2. Performance Evaluation and Discussion 


To estimate the performance of the proposed model, additional performance metrics need to be
explored in this study. The most widespread performance measures in the field of deep learning
are Precision, Sensitivity (recall), and F1 Score [65], presented in Equations (9) to (11).


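$$\text{Precision} = \frac{TrueP}{TrueP + FalseP} \qquad (9)$$

$$\text{Sensitivity} = \frac{TrueP}{TrueP + FalseN} \qquad (10)$$

$$\text{F1 Score} = 2 \cdot \frac{\text{Precision} \cdot \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}} \qquad (11)$$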


where TrueP is the count of true positive samples, TrueN is the count of true negative samples, 
FalseP is the count of false positive samples, and FalseN is the count of false negative samples from a 
confusion matrix. 
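
The three measures can be computed directly from a confusion matrix, as in the Python sketch
below. The convention that rows hold true classes and columns hold predicted classes, and the
macro-averaging over classes, are assumptions made for the example; the paper does not state how
the per-class values are aggregated.

```python
def per_class_metrics(confusion):
    """Return (precision, sensitivity, f1) for each class.

    confusion[i][j] is the number of samples of true class i predicted as class j
    (assumed layout).
    """
    n = len(confusion)
    results = []
    for k in range(n):
        true_p = confusion[k][k]
        false_p = sum(confusion[i][k] for i in range(n)) - true_p
        false_n = sum(confusion[k]) - true_p
        precision = true_p / (true_p + false_p) if true_p + false_p else 0.0
        sensitivity = true_p / (true_p + false_n) if true_p + false_n else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        results.append((precision, sensitivity, f1))
    return results


def macro_average(results):
    """Unweighted mean of each metric over the classes (an assumed aggregation)."""
    n = len(results)
    return tuple(sum(r[i] for r in results) / n for i in range(3))


# Hypothetical two-class confusion matrix, used only to exercise the functions.
example = [[8, 1], [3, 6]]
print(per_class_metrics(example))
print(macro_average(per_class_metrics(example)))
```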

Table 9 presents the performance metrics for the different scenarios and deep transfer models on
the testing data. In the first scenario, which contains four classes, Googlenet achieved the
highest precision, sensitivity, and F1 score, which supports the decision to choose Googlenet as
the deep transfer model. In the second scenario, which contains three classes, Alexnet achieved
the highest percentage for the precision and recall metrics, while Resnet18 achieved the highest
score in F1 with 88.10%; overall, however, Alexnet had the highest testing accuracy, which also
supports the decision to choose Alexnet as the deep transfer model.

Table 9 also illustrates that in the third scenario, which contains two classes, all deep
transfer learning models achieved the same highest percentage for precision, recall, and F1
score. This supports the decision to choose Googlenet, as it achieved the highest validation
accuracy with 99.9%, as illustrated in Table 7.

Table 9. Performance measurements for different scenarios.


# of Classes   Metric             Alexnet   Googlenet   Resnet18
4 classes      Precision          64.68%    84.17%      72.50%
               Recall             66.67%    80.56%      66.67%
               F1 Score           65.66%    82.32%      69.46%
               Testing Accuracy   66.67%    80.56%      69.46%
3 classes      Precision          85.19%    81.44%      88.10%
               Recall             85.19%    81.48%      81.48%
               F1 Score           85.19%    81.46%      84.66%
               Testing Accuracy   85.19%    81.48%      81.48%
2 classes      Precision          100%      100%        100%
               Recall             100%      100%        100%
               F1 Score           100%      100%        100%
               Testing Accuracy   100%      100%        100%


6. Conclusions and Future Works 


The 2019 novel coronavirus (COVID-19) belongs to a family of viruses that leads to illnesses
ranging from the common cold to more severe diseases and may lead to death, according to the
World Health Organization (WHO). With the advances in computer algorithms, and especially
artificial intelligence, the detection of this type of virus in its early stages will help in
fast recovery. In this paper, a GAN with deep transfer learning for COVID-19 detection in limited
chest X-ray images is presented. The lack of benchmark datasets for COVID-19, especially of chest
X-ray images, was the main motivation of this research. The main idea is to collect all the
possible images for COVID-19 and use the GAN network to generate more images to help in the
detection of the virus from the available X-ray images. The dataset in this research was
collected from different sources and contains 307 images in four classes: Covid, normal,
pneumonia bacterial, and pneumonia virus.

Three deep transfer models were selected in this research for investigation. These models were
selected because they contain a small number of layers in their architectures, which reduces
the complexity, the memory consumption, and the execution time of the proposed model. Three
scenarios were tested in the paper: the first included the four classes from the dataset, the
second included three classes, and the third included two classes. All the scenarios included
the COVID-19 class, as detecting it was the main target of this research. In the first scenario,
Googlenet was selected as the main deep transfer model, as it achieved 80.6% testing accuracy.
In the second scenario, Alexnet was selected as the main deep transfer model, as it achieved
85.2% testing accuracy. In the third scenario, which included two classes (COVID-19 and normal),
Googlenet was selected as the main deep transfer model, as it achieved 100% testing accuracy and
99.9% validation accuracy. One direction for future work is to apply the deep models to a larger
benchmark dataset.